Hybrid Text Mining for Finding Abbreviations and their Definitions
نویسندگان
چکیده
We present a hybrid text mining method for finding abbreviations and their definitions in free format texts. To deal with the problem, this method employs pattern-based abbreviation rules in addition to text markers and cue words. The pattern-based rules describe how abbreviations are formed from definitions. Rules can be generated automatically and/or manually and can be augmented when the system processes new documents. The proposed method has the advantages of high accuracy, high flexibility, wide coverage, and fast recognition.
منابع مشابه
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملFinding abbreviations in biomedical literature: three BioC-compatible modules and four BioC-formatted corpora
BioC is a recently created XML format to share text data and annotations, and an accompanying input/output library to promote interoperability of data and tools for natural language processing of biomedical text. This article reports the use of BioC to address a common challenge in processing biomedical text information-that of frequent entity name abbreviation. We selected three different abbr...
متن کاملTranslation of Acronyms, Initialisms and Abbreviations (AIA) in Persian Political and Sport Journalistic Texts
The different writing systems of English and Persian makes translation of acronyms, initialisms and abbreviations challenging. This study aimed at finding which strategies were applied most frequently in translating acronyms, initialisms and abbreviations from English to Persian especially in journalistic texts. The study was done based n Descriptive Translation Study of Toury and strategies pr...
متن کاملA review of text mining approaches and their function in discovering and extracting a topic
Background and aim: Four text mining methods are examined and focused on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...
متن کاملOntology based Text Mining of Concept Definitions in Biomedical Literature
Many developers of biomedical knowledge bases typically validate and update formalized knowledge based on reviews of full-text scientific articles, but finding text relevant to domain concepts can be tedious and prone to errors. Prior methods have automated this process by matching term-based patterns within a single sentence. In our work developing a knowledge base of autism phenotypes, specif...
متن کامل